Modeling, Querying, and Mining Uncertain XML Data

نویسندگان

  • Evgeny Kharlamov
  • Pierre Senellart
چکیده

This chapter deals with data mining in uncertain XML data models, this uncertainty typically coming from imprecise automatic processes. We first review the literature on modeling uncertain data, starting with well-studied relational models and moving then to their semistructured counterparts. We focus on a specific probabilistic XML model, that allows representing arbitrary finite distributions of XML documents, and has been extended to also allow continuous distributions of data values. We summarize previous work on querying this uncertain data model and show how to apply the corresponding techniques to several data mining tasks, exemplified through use cases on two running examples.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On Uncertain Probabilistic Data Modeling

Uncertainty in data is caused by various reasons including data itself, data mapping, and data policy. For data itself, data are uncertain because of various reasons. For example, data from a sensor network, Internet of Things or Radio Frequency Identification is often inaccurate and uncertain because of devices or environmental factors. For data mapping, integrated data from various heterogono...

متن کامل

A Probabilistic Approach to XML Data Management

Uncertainty is ubiquitous in data and can take various forms. Usually, this is not formally taken into account: only the most likely data interpretation is kept for future processing, or all probable choices of correct information above a threshold are maintained. We claim this is not sufficient. There is a need for managing the imprecision in data more rigorously, and the current thesis addres...

متن کامل

Similarity Search and Mining in Uncertain Databases

Managing, searching and mining uncertain data has achieved much attention in the database community recently due to new sensor technologies and new ways of collecting data. There is a number of challenges in terms of collecting, modelling, representing, querying, indexing and mining uncertain data. In its scope, the diversity of approaches addressing these topics is very high because the underl...

متن کامل

Mining tree-based association rules from XML documents

The increasing amount of XML datasets available to casual users increases the necessity of investigating techniques to extract knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semistructured datasets. In this work we describe an approach to mine Tree-based association rules from XM...

متن کامل

An efficient XML query pattern mining algorithm for ebXML applications in e-commerce

Providing efficient query to XML data for ebXML applications in e-commerce is crucial, as XML has become the most important technique to exchange data over the Internet. ebXML is a set of specification for companies to exchange their data in e-commerce. Following the ebXML specifications, companies have a standard method to exchange business messages, communicate data, and business rules in e-c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010